Evaluating the ability of some natural phenolic acids to target the main protease and AAK1 in SARS COV-2

Researchers are constantly searching for drugs to combat the coronavirus pandemic caused by SARS-CoV-2, which has lasted for over two years. Natural compounds such as phenolic acids are being tested against Mpro and AAK1, which are key players in the SARS-CoV-2 life cycle. This research work aims to study the ability of a panel of natural phenolic acids to inhibit the virus's multiplication directly through Mpro and indirectly by affecting the adaptor-associated protein kinase-1 (AAK1). Pharmacophore mapping, molecular docking, and dynamic studies were conducted over 50 ns and 100 ns on a panel of 39 natural phenolic acids. Rosmarinic acid (16) on the Mpro receptor (− 16.33 kcal/mol) and tannic acid (17) on the AAK1 receptor (− 17.15 kcal/mol) exhibited the best docking energy against both receptors. These favourable docking score values were found to be superior to those of the co-crystallized ligands. Preclinical and clinical research is required before using them simultaneously to halt the COVID-19 life cycle in a synergistic manner.


Methods
In this study, molecular operating environment (MOE, 2019.0102) software was used for pharmacophoric generation, searching and molecular docking [24][25][26] , energy minimized structures were gained by applying MMFF94x force field until RMSD gradient of 0.1 kcal mol −1 Å −1 was reached. The protein data bank (PDB) was the source of the X-ray crystallographic structures of SARS CoV-2 main protease enzyme (Mpro) (ID: 7AEH) and adaptorassociated protein kinase 1 (AAK1) (ID: 4WSQ). Triangle matcher as a placement method and London dG as a scoring algorithm were used to determine the pharmacophoric properties of both enzymes and the binding site in each protein file using the co-crystallized ligand. AutoDock Vina 1.1.2. Software was also used for conducting the molecular docking study to confirm the results. SwissADME was used to evaluate the ADME properties of the promising compounds 27 .
Pharmacophore mapping. SARS CoV-2 main protease enzyme (Mpro). The binding characteristics for the co-crystallized ligand were chosen, stored, and utilized to search for potential matches from the phenolic acids database using the PDB file (ID: 7AEH) that was obtained from the protein data bank.
Adaptor-associated protein kinase 1 (AAK1). The PDB file (ID: 4WSQ) was obtained for mapping the pharmacophoric characteristics of AAK1 and examined to determine the most crucial binding properties. For the purpose of choosing the most appropriate phenolic acid structures, the created pharmacophore was preserved and utilized. www.nature.com/scientificreports/ Adaptor-associated protein kinase 1 (AAK1). PDB file (ID: 4WSQ) was used for performing molecular docking simulations, the tested compounds were docked at the binding pocket of the enzyme.

Molecular dynamics simulations.
The best docking postures were subjected to molecular dynamics (MD) computations to get insights into the stability of the protein-ligand interaction. The CHARMM-GUI solution builder created the input files for the MD calculations using the CHARMM force field parameters for proteins 28 . The simulation box dimensions of Mpro/Mpro-16 and AAK1/AAK1-16 were 98 × 98 × 98 nm and 132 × 132 × 132 nm, respectively, containing four sodium ions and seven chloride ions. The protein-ligand combination was subjected to the CHARMM36 forcefield. Prior to production simulation, the system's energy consumption was minimized using the steepest descent algorithm (5000 steps). The complex was then subjected to NVT and NPT ensemble, simulating for 125 ps at 300.15 K temperature utilizing 400 kJ mol −1 nm 2 and 40 kJ mol −1 nm 2 positional restrictions on the backbone and side chains, respectively, to equilibrate the complex for stabilizing its temperature and pressure. The complex is then put through a 100-ns production simulation run in an NPT ensemble at 300.15 K and 1 bar 29 .
ADMET study. SwissADME (http:// www. swiss adme. ch/ index. php) web server was used the study the ADME properties of the promising compounds, it can evaluate the pharmacokinetics, drug-likeness and medicinal chemistry friendliness of small molecules. The smiles of the selected library were generated and then the ADME properties were calculated. Lazar toxicity predictor web server (https:// lazar. in-silico. ch/ predi ct) was used to estimate the toxicity profile of (16) rosmarinic acid against Mpro and (17) tannic acid against AAK1.

Results and discussions
Pharmacophore mapping and molecular docking on SARS CoV-2 main protease (Mpro). Pharmacophore generation. 3D Pharmacophores represent the ensembles of the chemically defined interactions of the bioactive conformation of the ligand, accordingly, pharmacophore generation represents an elegant way to decode the chemically encoded ligand information and so becomes a valuable tool in drug design 30 Molecular docking on SARS Cov-2 Mpro. Molecular docking study was carried out for the hit molecules, their binding energy scores are summarized in Table 1, compared with the co-crystallized ligand (− 10.52 kcal./mol.).
According to the binding energy scores in Table 1  www.nature.com/scientificreports/   Molecular docking against AAK1. Molecular docking study was carried out for the hit molecules, their binding energy scores are summarized in Table 2, compared to the co-crystallized ligand (− 13.28 kcal./mol.).
Analysis of molecular docking results revealed that all three tested compounds have the main binding interaction with CYS-193 34 . Compound (17), tannic acid showed the highest binding energy score (− 17.15 kcal./mol.) with strong H-bond interactions through its hydroxyl groups to CYS-129, GLU-180 and CYS-193 amino acid residues. On the other hand, compound (16), rosmarinic acid showed H-bond interaction through its carboxylic acid OH group to CYS-193 in addition to binding of phenolic OH group to ASP-127 in the binding vicinity.   www.nature.com/scientificreports/ Simulation time over 50 ns. Analysis of the root mean square deviation (RMSD). The root mean square deviation (RMSD) was investigated quantitatively to assess the degree of divergence of each complex protein structure with each ligand from its baseline behaviour. The system's stability is evaluated during the simulation with the help of the RMSD. In two different MD simulations, a control system (a ligand-free structure) and a complex were put up for this 38 . The stability and convergence of compound 16 in the target Mpro and compound 17 in the target AAK1 were investigated using a 50-ns molecular dynamics (MD) simulation, where the backbone atoms' RMSD value was obtained, as shown in Fig. 12. The findings indicated that the complex-maintained equilibrium for the course of the experiment. The RMSD values of the apoprotein and the compound 17-bound complex were 0.14-0.36 nm. Furthermore, the RMSD values of the apoprotein and the compound 16-bound complex ranged from 0.12 to 0.23 nm. Compounds 16 and 17 behaved consistently inside their pockets throughout the simulation and went farther toward the binding pocket. This could explain why both 16 and 17 have potent inhibitory effects against Mpro and AAK1, respectively.
Analysis of the root mean square fluctuation (RMSF). The local changes in the protein structure brought on by the presence of the suggested inhibitor were examined using the root mean square fluctuation (RMSF) 39 . Throughout the simulation period, it showed how flexible the protein was. The ranges of 0.03-0.48 nm and 0.15-1.2 nm showed the highest variation. The similar residues in the compounds 16 or 17-bound complexes were less flexible than the native, unbound targets in general. These low-fluctuating residues contributed to the stability of the docked molecules at the binding site (Fig. 13).   www.nature.com/scientificreports/ Analysis for the radius of gyration (R g ). The radius of gyration provides information regarding the size and compactness of protein molecules (Rg). The Rg can be used to track the folding and unfolding of protein structures when ligands are bound 40 . Rg values for the drug-bound complexes were typically closer to the native unbound Mpro and AAK1 values (Fig. 14). Compound Figure 14 illustrates how, as compared to the unbound protein, each complex had very identical properties in terms of compactness and almost constant values of Rg.
Analysis of solvent-accessible surface area (SASA). Both with and without ligands, the protein's solventaccessible surface area (SASA) was studied. The protein-ligand complex's SASA computation is used to forecast the number of conformational changes that the aqueous solvent can access 41 . As a result, the SASA was used to evaluate interactions between the complex and the solvent throughout the 50-ns MD simulation. For the unbound protein and protein-ligand complexes, Fig. 15 shows the SASA vs. simulation time curve. In addition  www.nature.com/scientificreports/ to compound 17 and AAK1, the SASA averages for compound 16 and Mpro ranged from 141 to 155 nm 2 and 290 to 320 nm 2 , respectively. Each compound binding caused the SASA to slightly increase due to the expanded surface generated by a portion of the bound ligand surface poking out from the protein surface.
Analysis of hydrogen bond. The protein-ligand complex is stabilized by hydrogen bonds that form between the receptor and ligand. It also impacts the design of medications and their specificity, metabolization, and adsorption 42 . Therefore, the hydrogen bonds in each ligand-protein combination were studied. Figure 16 displays the total number of hydrogen bonds discovered in the complex after a 50-ns simulation. Each complex had between one and six hydrogen bonds, with two of them remaining constant over the course of the simulation. Additionally, each compound showed a similar hydrogen-bonding pattern throughout the simulation, as seen in Fig. 16.
Simulation time over 100 ns. Analysis of the root mean square deviation (RMSD). When the simulation period was extended to 100 nm, we found that the complex still maintained equilibrium for the course of the experiment (Fig. 17). The RMSD values of the apoprotein and the compound 17-bound complex reached  Analysis of the root mean square fluctuation (RMSF). Throughout the extended simulation period to 100 ns, it still showed how flexible the proteins were. The similar residues in the compounds 16-and 17-bound complexes were less flexible than the native, unbound targets. These low-fluctuating residues contributed to the stability of the docked molecules at the binding site (Fig. 18).
Analysis for the radius of gyration (R g ). Rg values for the drug-bound complexes were still typically closer to the native unbound Mpro and AAK1 values. Figure 19 illustrates how, as compared to the unbound protein, each complex had very identical properties in terms of compactness and almost constant values of Rg.  www.nature.com/scientificreports/ Analysis of solvent-accessible surface area (SASA). Each compound binding still caused SASA to increase slightly due to the expanded surface resulting from a portion of the binding surface bound from the protein surface (Fig. 20).
Analysis of hydrogen bond. The complex formed by compound 17 with AAK1 (in black color) still had between one and six hydrogen bonds, with two of them remaining constant over the course of the simulation  www.nature.com/scientificreports/ (100 ns). But the complex formed by compound 16 with Mpro (in red color) had between one and three hydrogen bonds, without making any hydrogen bonding around 30-70 ns period (Fig. 21).
Principle component analysis (PCA). The conformational distribution during the simulation time was investigated using the PCA approach, as well as the large-scale collective motions of the protein in protein-ligand complexes on the simulation-generated trajectories. It was assumed that the complex that takes up less phase space with a stable cluster is more stable than the complex that takes up more space with a nonstable cluster 43 . During simulations of compound 16 bound to Mpro and compound 17 bound to AAK1 proteins, the first two principal components (PC1 and PC2) were chosen to investigate their projection of trajectories in phase space. The results clearly showed that Mpro-16 complex occupied smaller regions of phase space than AAK1-17 complex (Fig. 22).
Binding energy estimation by MM/PBSA method. The Molecular Mechanics/Poisson Boltzmann Surface Area (MM/PBSA) method throughout the 100 ns simulations was selected for rescoring complexes because it is the fastest force field-based method that computes the free energy of binding, as compared to the other computational free energy methods, such as free energy perturbation (FEP) or thermodynamic integration (TI) methods. The MM/PBSA calculation was performed using g-mmpbsa software. The calculated binding free energies are shown in Table 3   www.nature.com/scientificreports/ ADME study. Selected compounds that showed good binding activity to both tested enzymes (9, 16, 17, 20, 21, 22 and 26) were studied and tested through the SwissADME web tool. First, we studied the bioavailability through a radar chart that tests six parameters, Lipophilicity (LIPO), Size, Polarity (POLAR), Insolubility (INSOL), Unsaturation (UNSAT) and Flexibility (FLEX). Compounds 22 and 26 showed no violation, compounds 9, 20 and 21 showed only one violation in the chart, while compounds 16 and 17 showed more than one violation due to their high hydrophilic characters of both compounds revealing their expected oral bioavailability (Fig. 23). The SwissADME server also provides a BOILED EGG chart to indicate the human intestinal absorption (white part), blood-brain barrier penetration (the yellow part) 44 , and the probability of the tested compound acting as a substrate for permeability glycoprotein (PGP) which is an efflux pump for many drugs (blue color if it's a possible substrate or red color if it's not) 45 . Only three of the tested compounds showed good gastrointestinal absorption (compounds 9, 22 and 26), which exhibited a good balance between lipophilic and hydrophilic characters. Other tested phenolic acid derivatives exhibited poor gastrointestinal absorption due to their high hydrophilic characters, especially for tannic acid (compound 17), which did not appear in the chart due to extreme hydrophilic properties. All the seven tested compounds are not substrates for PGP, so they will not be susceptible to cell efflux (Fig. 24).
Eventually, applying Lipinski's rule of five 46 to predict the probability of the tested compounds being orally active, all showed no violations in the four parameters (log P, molecular weight, number of H-bond donor groups and number of H-bond acceptor groups) except for compound 17 (three violations) and compound 20 (one  www.nature.com/scientificreports/ violation), this was because compound 17 (tannic acid) is extremely hydrophilic with high molecular weight ( Table 4).
The overall data shows that compounds 16 (rosmarinic acid) and 17 (tannic acid) require structural modifications to improve their pharmacokinetic properties and to be suitable for oral administration. Toxicity profile of our most promising natural acids 16 and 17 against Mpro and was investigated through lazar toxicity predictor, the obtained results revealed that both (16) rosmarinic acid and (17)     www.nature.com/scientificreports/

Conclusion
The objective of this study was to find natural phenolic acids that can target SARS-CoV-2's main protease (Mpro) and adaptor-associated protein kinase 1 (AAK1) enzymes. The researchers employed several molecular modelling techniques such as pharmacophore mapping, molecular docking, molecular dynamics, and ADME studies. Rosmarinic acid (16) and 2,5-dihydroxybenzoic-5-O-glucoside (20) demonstrated remarkable binding affinities for both target enzymes and, as such, could act as dual inhibitors. Tannic acid (17) showed good potential in fitting all the required pharmacophoric features in AAK1 with the lowest docking score. These results were confirmed through pharmacophore mapping, molecular docking, and dynamic studies over 50 ns and extended to 100 ns.  . BOILED-EGG chart for the seven tested compounds (Yellow area is blood-brain barrier; BBB, white area is human intestinal absorption; HIA, red circles mean that these compounds are non-substrate for PGP). www.nature.com/scientificreports/ In terms of the ADME study, Rosmarinic acid has low bioavailability due to its high polarity and thus requires structural modification to improve its bioavailability as a promising dual target for COVID-19 prevention via main protease inhibition and endocytosis inhibition.

Data availability
All data generated or analyzed during this study are included in this article.